@weihanglo
Member

@weihanglo weihanglo commented Nov 15, 2025

Add an `unnormalized_source_len` field to track the byte length
of source files before normalization (the original length).

`unnormalized_source_len` is used to write the correct file length
to dep-info for `-Zchecksum-hash-algorithm`.

Fixes #148934
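
To make the distinction concrete, here is a minimal sketch of why the two lengths can differ. It assumes a simplified normalization that strips a leading UTF-8 BOM and converts `\r\n` to `\n`; the `SourceFileLens` struct and the `normalize` and `measure` functions are hypothetical stand-ins for illustration, not the compiler's actual types or code.

```rust
/// Hypothetical, simplified stand-in for the two length fields discussed
/// in this PR. This is not rustc's real `SourceFile` type.
struct SourceFileLens {
    /// Byte length after normalization.
    normalized_source_len: u32,
    /// Byte length of the file as it was read, before normalization.
    unnormalized_source_len: u32,
}

/// Simplified normalization (an assumption for this sketch):
/// strip a leading UTF-8 BOM and turn every `\r\n` into `\n`.
fn normalize(src: &str) -> String {
    let without_bom = src.strip_prefix('\u{feff}').unwrap_or(src);
    without_bom.replace("\r\n", "\n")
}

/// Record both lengths for a piece of raw source text.
fn measure(raw: &str) -> SourceFileLens {
    let unnormalized_source_len = raw.len() as u32;
    let normalized_source_len = normalize(raw).len() as u32;
    SourceFileLens { normalized_source_len, unnormalized_source_len }
}
```

For a file with a BOM and CRLF line endings, the normalized length is smaller than the on-disk length, so a tool comparing the recorded length against the file on disk needs the unnormalized one.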

@rustbot rustbot added A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 15, 2025
@rustbot
Collaborator

rustbot commented Nov 15, 2025

r? @jieyouxu

rustbot has assigned @jieyouxu.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

self.checksum_hash.encode(s);
// Do not encode `start_pos` as it's global state for this session.
self.source_len.encode(s);
self.original_source_len.encode(s);
Member Author

Not sure if this is really needed

Member

@jieyouxu jieyouxu Nov 15, 2025

I would say if the normalized source len is encoded, then unnormalized source len should also be encoded. I would imagine that if the source file changed, then well, it did materially change.
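
As a rough illustration of encoding both lengths side by side, the sketch below round-trips a pair of lengths through a byte buffer. The `Lens` struct, the `encode` and `decode` functions, and the little-endian layout are all assumptions made for this example; the compiler uses its own encoder traits, which are not reproduced here.

```rust
use std::convert::TryInto;

/// Hypothetical pair of lengths, named after the fields in this thread.
#[derive(Debug, PartialEq)]
struct Lens {
    normalized_source_len: u32,
    unnormalized_source_len: u32,
}

/// Append both lengths to `out` as little-endian `u32`s
/// (an illustrative layout, not rustc's metadata format).
fn encode(lens: &Lens, out: &mut Vec<u8>) {
    out.extend_from_slice(&lens.normalized_source_len.to_le_bytes());
    out.extend_from_slice(&lens.unnormalized_source_len.to_le_bytes());
}

/// Read both lengths back; returns `None` if fewer than 8 bytes are given.
fn decode(bytes: &[u8]) -> Option<Lens> {
    let n: [u8; 4] = bytes.get(0..4)?.try_into().ok()?;
    let u: [u8; 4] = bytes.get(4..8)?.try_into().ok()?;
    Some(Lens {
        normalized_source_len: u32::from_le_bytes(n),
        unnormalized_source_len: u32::from_le_bytes(u),
    })
}
```

If only the normalized length were written, a decoder would have no way to recover the original on-disk length afterwards.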

(
escape_dep_filename(&fmap.name.prefer_local().to_string()),
fmap.source_len.0 as u64,
fmap.original_source_len as u64,
Member Author

This line is the goal of the entire PR

Member

Suggestion: can you please add a comment to elaborate here for locality? That is,

source_len -> original_source_len is very subtle.

Member Author

Added. Yeah it is clearer now.
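
The subtlety in this thread can be sketched as follows: the length written next to a file's checksum should be the unnormalized length, because that is what a build tool sees on disk. The `FileInfo` struct, the `checksum_line` function, and the exact line layout below are assumptions for illustration only, not the compiler's actual code or a guaranteed dep-info format.

```rust
/// Hypothetical record for one source file; field names follow this thread.
struct FileInfo {
    path: String,
    checksum: String,
    /// Length after normalization. Deliberately NOT used in the output below.
    #[allow(dead_code)]
    normalized_source_len: u64,
    /// Length of the file as it exists on disk.
    unnormalized_source_len: u64,
}

/// Format one checksum comment line for a dep-info file.
/// The layout is an assumed, illustrative one.
fn checksum_line(file: &FileInfo) -> String {
    format!(
        "# checksum:{} file_len:{} {}",
        file.checksum,
        // Use the unnormalized length so it matches the size on disk.
        file.unnormalized_source_len,
        file.path,
    )
}
```

A consumer that first compares `file_len` against the on-disk size before hashing would otherwise see a mismatch for any file whose contents were changed by normalization.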

Member

@jieyouxu jieyouxu left a comment

I'm not an expert in this part of the compiler, however the changes make sense to me. A few "cosmetic" nits.

Comment on lines 1728 to 1729
/// The byte length of this source before normalization.
pub original_source_len: u32,
Member

Suggestion: I'm not sure how invasive the change is, but we can rename this pair of source lengths (better names welcome):

  • source_len -> normalized_source_len
  • original_source_len -> unnormalized_source_len

When reading this diff, I find that source_len vs original_source_len is really not obvious. I can appreciate if the diff is intentionally kept small to make review easier, but I would prefer that we also rename the source_len field for consistency.

Member Author

Nice suggestion. Done in a separate commit.

@jieyouxu
Member

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 15, 2025
@weihanglo weihanglo changed the title from "fix(span): track original source len for dep-info" to "fix(span): track unnormalized source len for dep-info" Nov 15, 2025
@weihanglo weihanglo requested a review from jieyouxu November 15, 2025 13:44
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 15, 2025
@jieyouxu
Member

Fun. clippy is another josh subtree, so it's okay to change r-l/r side since we obviously need to change the compiler tree here.

@weihanglo
Member Author

At least Cargo doesn't have that, otherwise I'll need to do two more PRs 😬

@weihanglo
Member Author

Oops typo

This is preparation for introducing an unnormalized source length field
Add `unnormalized_source_len` field to track the byte length
of source files before normalization (the original length).

`unnormalized_source_len` is for writing the correct file length
to dep-info for `-Zchecksum-hash-algorithm`
@rustbot
Collaborator

rustbot commented Nov 15, 2025

Some changes occurred in src/tools/clippy

cc @rust-lang/clippy

@rustbot rustbot added the T-clippy Relevant to the Clippy team. label Nov 15, 2025
Member

@jieyouxu jieyouxu left a comment

@jieyouxu
Member

@bors r+ rollup

@bors
Collaborator

bors commented Nov 16, 2025

📌 Commit cf57b9b has been approved by jieyouxu

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 16, 2025
Zalathar added a commit to Zalathar/rust that referenced this pull request Nov 16, 2025
fix(span): track unnormalized source len for dep-info

Add `unnormalized_source_len` field to track the byte length
of source files before normalization (the original length).

`unnormalized_source_len` is for writing the correct file length
to dep-info for `-Zchecksum-hash-algorithm`

Fixes rust-lang#148934
bors added a commit that referenced this pull request Nov 16, 2025
Rollup of 3 pull requests

Successful merges:

 - #145954 (stabilize extern_system_varargs)
 - #148962 (fix(span): track unnormalized source len for dep-info)
 - #148969 (compiletest: Don't apply "emscripten" directives to `wasm32-unknown-unknown`)

r? `@ghost`
`@rustbot` modify labels: rollup
Labels

A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-clippy Relevant to the Clippy team. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Development

Successfully merging this pull request may close these issues.

-Zchecksum-hash-algorithm used normalized file size in dep-info

5 participants